Semantic Rerank: Adds Semantic Rerank API #5445
milismsft
left a comment
Please try to address the potential for multiple background tasks related to the Inference object (and properly dispose of that task as well) :-)
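For illustration only, one way to guarantee a single instance and a clean teardown is a thread-safe `Lazy<T>` plus an idempotent `Dispose`. This is a minimal sketch, not the PR's code: `InferenceServiceSketch` and `InferenceOwnerSketch` are hypothetical stand-ins for the inference service and the client context that owns it.

```csharp
using System;
using System.Net.Http;
using System.Threading;

// Hypothetical stand-in for the inference service; only disposal is modeled.
internal sealed class InferenceServiceSketch : IDisposable
{
    private readonly HttpClient httpClient = new HttpClient();
    private int disposed;

    public bool IsDisposed => Volatile.Read(ref this.disposed) == 1;

    public void Dispose()
    {
        // Interlocked.Exchange makes Dispose safe to call more than once.
        if (Interlocked.Exchange(ref this.disposed, 1) == 0)
        {
            this.httpClient.Dispose();
        }
    }
}

// Hypothetical owner, standing in for the client context.
internal sealed class InferenceOwnerSketch : IDisposable
{
    // ExecutionAndPublication runs the factory at most once, so concurrent
    // callers cannot create duplicate services (or duplicate background work).
    private readonly Lazy<InferenceServiceSketch> inference =
        new Lazy<InferenceServiceSketch>(
            () => new InferenceServiceSketch(),
            LazyThreadSafetyMode.ExecutionAndPublication);

    public InferenceServiceSketch GetOrCreateInferenceService()
    {
        return this.inference.Value;
    }

    public void Dispose()
    {
        // Dispose only what was actually created.
        if (this.inference.IsValueCreated)
        {
            this.inference.Value.Dispose();
        }
    }
}
```

Any background task the real service starts would need to be cancelled and awaited in its own `Dispose`; that part is not modeled here.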
@NaluTripician LGTM. I added the comments that we discussed offline.
Microsoft.Azure.Cosmos/src/RequestOptions/SemanticRerankRequestOptions.cs
Microsoft.Azure.Cosmos/src/Resource/Container/ContainerInlineCore.cs
aayush3011
left a comment
LGTM. Thanks Nalu
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
    AuthorizationTokenType tokenType,
    ITrace trace);

public abstract ValueTask AddInferenceAuthorizationHeaderAsync(
IMHO, let's not overload the core types with inference-specific concerns?
    string containerLinkUri,
    CancellationToken cancellationToken);

internal abstract Task<SemanticRerankResult> SemanticRerankAsync(
Is it needed as part of the contract?
One option is to just inline the implementation inside ContainerInlineCore.cs
This is needed because ContainerInlineCore calls the generic CosmosClientContext class rather than its implementation.
// Create and configure HttpClient for inference requests.
HttpMessageHandler httpMessageHandler = CosmosHttpClientCore.CreateHttpClientHandler(
    gatewayModeMaxConnectionLimit: client.DocumentClient.ConnectionPolicy.MaxConnectionLimit,
Let's please isolate Inference settings/configurations.
    RuntimeConstants.MediaTypes.Json);

// Send the request and ensure success.
HttpResponseMessage responseMessage = await this.httpClient.SendAsync(message, cancellationToken);
What's the reliability story (e.g., retries)?
Retries will come at a later date
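As a placeholder for that follow-up, here is a minimal sketch of what a retry wrapper around `HttpClient.SendAsync` could look like. Everything in it is an assumption rather than the SDK's design: the `InferenceRetrySketch` name, the choice of 408/429/5xx as transient, and the exponential backoff are illustrative only.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper; not part of the PR.
internal static class InferenceRetrySketch
{
    // requestFactory builds a fresh HttpRequestMessage per attempt,
    // because a message instance cannot be sent twice.
    public static async Task<HttpResponseMessage> SendWithRetryAsync(
        HttpClient httpClient,
        Func<HttpRequestMessage> requestFactory,
        int maxAttempts,
        TimeSpan baseDelay,
        CancellationToken cancellationToken)
    {
        for (int attempt = 1; ; attempt++)
        {
            HttpResponseMessage response = null;
            try
            {
                using (HttpRequestMessage request = requestFactory())
                {
                    response = await httpClient.SendAsync(request, cancellationToken);
                }

                // Return on success, on a non-transient failure, or on the last attempt.
                if (!IsTransient(response.StatusCode) || attempt >= maxAttempts)
                {
                    return response;
                }
            }
            catch (HttpRequestException) when (attempt < maxAttempts)
            {
                // Network-level failure with attempts remaining: fall through and retry.
            }

            response?.Dispose();

            // Exponential backoff: baseDelay, 2x, 4x, ...
            double delayMs = baseDelay.TotalMilliseconds * Math.Pow(2, attempt - 1);
            await Task.Delay(TimeSpan.FromMilliseconds(delayMs), cancellationToken);
        }
    }

    // Assumption: timeouts, throttling, and server errors are worth retrying.
    internal static bool IsTransient(HttpStatusCode statusCode)
    {
        int code = (int)statusCode;
        return code == 408 || code == 429 || code >= 500;
    }
}
```

A real implementation would probably also honor a `Retry-After` header on 429 responses and add jitter to the delay.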
// Parse the rerank scores, latency, and token usage from the response.
return new SemanticRerankResult(
    ParseRerankScores(responseJson["Scores"]),
    responseJson.ContainsKey("latency") ? Newtonsoft.Json.JsonConvert.DeserializeObject<Dictionary<string, object>>(responseJson["latency"].ToString()) : null,
Deserialize to a shadow or internal type instead?
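To make the suggestion concrete, a rough sketch of such an internal type, assuming Newtonsoft.Json as in the snippet above. `RerankResponseSketch` is a hypothetical name, and only the three keys visible in the diff ("Scores", "latency", "token_usage") are mapped; the element shape of "Scores" is not shown in this thread, so it stays a `JToken`.

```csharp
using System.Collections.Generic;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

// Hypothetical internal shape for the inference response.
internal sealed class RerankResponseSketch
{
    // Kept as a raw token because the score element shape is not known here.
    [JsonProperty("Scores")]
    public JToken Scores { get; set; }

    // Absent keys deserialize to null, replacing the manual ContainsKey checks.
    [JsonProperty("latency")]
    public Dictionary<string, object> Latency { get; set; }

    [JsonProperty("token_usage")]
    public Dictionary<string, object> TokenUsage { get; set; }

    public static RerankResponseSketch Parse(string json)
    {
        return JsonConvert.DeserializeObject<RerankResponseSketch>(json);
    }
}
```

With a type like this, the response is parsed once instead of round-tripping each section through `ToString()` and a second deserialize call.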
return new SemanticRerankResult(
    ParseRerankScores(responseJson["Scores"]),
    responseJson.ContainsKey("latency") ? Newtonsoft.Json.JsonConvert.DeserializeObject<Dictionary<string, object>>(responseJson["latency"].ToString()) : null,
    responseJson.ContainsKey("token_usage") ? Newtonsoft.Json.JsonConvert.DeserializeObject<Dictionary<string, object>>(responseJson["token_usage"].ToString()) : null,
Any new top-level values will be missed, right?
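One possible way to avoid dropping unknown top-level values, sketched under the assumption that Newtonsoft.Json stays in use: its `[JsonExtensionData]` attribute collects every property that does not map to a declared member. `RerankEnvelopeSketch` is a hypothetical type, and the "model_version" key in the usage note is invented purely for illustration.

```csharp
using System.Collections.Generic;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

// Hypothetical envelope that keeps unrecognized top-level properties.
internal sealed class RerankEnvelopeSketch
{
    [JsonProperty("Scores")]
    public JToken Scores { get; set; }

    // Any top-level property without a matching member lands here
    // instead of being silently discarded.
    [JsonExtensionData]
    public IDictionary<string, JToken> AdditionalData { get; set; }
        = new Dictionary<string, JToken>();

    public static RerankEnvelopeSketch Parse(string json)
    {
        return JsonConvert.DeserializeObject<RerankEnvelopeSketch>(json);
    }
}
```

For example, parsing `{"Scores":[],"model_version":"x"}` leaves the made-up "model_version" entry available in `AdditionalData`, so a later service change would still be visible to callers.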
Pull Request Template
Description
This pull request introduces a new semantic reranking feature to the Azure Cosmos DB .NET SDK, enabling users to rerank documents using an inference service that leverages Azure Active Directory (AAD) authentication. The main changes include the addition of the InferenceService class, new API surface for semantic reranking, and appropriate integration into the SDK's authorization and client context infrastructure. Notably, this functionality is only available when using AAD authentication.

Semantic Reranking Feature Integration:
- Added the InferenceService class, which handles communication with the Cosmos DB Inference Service for semantic reranking, including HTTP client configuration, payload construction, and response handling. This service enforces AAD authentication and manages its own authorization and disposal.
- Added a public (PREVIEW) or internal API SemanticRerankAsync to the Container class, allowing users to rerank a list of documents based on a context/query string. This is implemented in ContainerInlineCore and routed through the client context.

Authorization and Token Handling Updates:
- Updated the AuthorizationTokenProvider abstraction and its implementations to support a new method, AddInferenceAuthorizationHeaderAsync, which is only valid for AAD-based token providers. Non-AAD providers throw a NotImplementedException for this method.

Client Context and Resource Management:
- Updated ClientContextCore and CosmosClientContext to manage the lifecycle of the InferenceService, including creation, caching, and disposal. Added methods for invoking semantic reranking and for retrieving or creating the inference service instance.

Dependency Updates:
- The Azure.Identity package in the test project, to support AAD authentication scenarios.

Please delete options that are not relevant.
Closing issues
To automatically close an issue: closes #IssueNumber